EXPLORING INFORMATION RETRIEVAL BY LATENT SEMANTIC INDEXING AND LATENT DIRICHLET ALLOCATION TECHNIQUES
Similar resources
Dimensionality Reduction and Topic Modeling: From Latent Semantic Indexing to Latent Dirichlet Allocation and Beyond
The bag-of-words representation commonly used in text analysis can be analyzed very efficiently and retains a great deal of useful information, but it is also troublesome because the same thought can be expressed using many different terms or one term can have very different meanings. Dimension reduction can collapse together terms that have the same semantics, to identify and disambiguate term...
Indexing by Latent Dirichlet Allocation and an Ensemble Model
The contribution of this paper is two-fold. First, we present indexing by Latent Dirichlet Allocation (LDI), an automatic document indexing method with a probabilistic concept search. The probability distributions in LDI utilize those in Latent Dirichlet Allocation (LDA), which is a generative topic model that has been previously used in applications for document indexing tasks. However, those...
The Sensitivity of Latent Dirichlet Allocation for Information Retrieval
It has been shown that the use of topic models for information retrieval provides an increase in precision when used in the appropriate form. Latent Dirichlet Allocation (LDA) is a generative topic model that allows us to model documents using a Dirichlet prior. Using this topic model, we are able to obtain a fitted Dirichlet parameter that provides the maximum likelihood for the document set. ...
Comparison of Information Retrieval Techniques: Latent Semantic Indexing and Concept Indexing
The task of information retrieval is to extract relevant documents for a certain query from the collection of documents. As large sets of documents are now increasingly common, there is a growing need for fast and efficient information retrieval algorithms. The algorithms we are dealing with are embedded in the vector space model. In this paper we compare two information retrieval techniques: l...
On the Performance of Latent Semantic Indexing based Information Retrieval
Conventional vector-based Information Retrieval (IR) models, the Vector Space Model (VSM) and the Generalized Vector Space Model (GVSM), represent documents and queries as vectors in a multidimensional space. This high-dimensional data places great demands on computing resources. To overcome these problems, Latent Semantic Indexing (LSI), a variant of VSM, projects the documents into a lower dimensiona...
Journal
Journal title: International Research Journal of Computer Science
Year: 2020
ISSN: 2393-9842
DOI: 10.26562/irjcs.2020.v0705.001